AMENDMENTS TO THE CLAIMS

The following listing of claims replaces all prior claims and all prior listings of 
claims in this application. 
1-27: (canceled) 

28. (new) A method facilitated by a human annotator and performed in a
computer environment for normalizing a score associated with a document, the method 
comprising the steps of: 

(a) establishing (1) through the computer environment a set of training documents 
most of which are believed not to be relevant to a topic (off-topic) and (2) through the 
human annotator a query relevant to the topic (on-topic); 

(b) assigning, through the computer environment, a training document relevance 
score to each one of the training documents, each training document relevance score 
representing a measure of relevance of its respective document to the topic; 

(c) determining, through the computer environment, statistics relating to all 
training document relevance scores and thereby obtaining determined statistics; 

(d) receiving a testing document; 

(e) calculating, through the computer environment, a score of relevance of the 
testing document to the topic to obtain a testing document relevance score; 

(f) normalizing, through the computer environment and based on the statistics, the
testing document relevance score to obtain a normalized score wherein: 
normalizing adjusts the testing document relevance score based on
the statistics to be comparable to other scores from which the statistics 
were determined, and 

the normalized score is a better predictor of probability of the 
testing document being relevant than the testing document relevance score;

(g) establishing, through the computer environment, a threshold score 
representing a relevance threshold for the topic; 

(h) comparing the normalized score to the threshold score to obtain a comparison; 

and 

(i) designating the testing document as relevant or not relevant to the topic 
based on the comparison. 
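
For illustration only, steps (a) through (c) of claim 28 might be sketched as follows. The claim does not specify how a relevance score is computed, so the term-overlap function `relevance_score` below, the sample documents, and the query are all hypothetical stand-ins rather than the claimed scoring method.

```python
import statistics


def relevance_score(document: str, query: str) -> float:
    """Toy score (an assumption): fraction of query terms found in the document."""
    query_terms = set(query.lower().split())
    doc_terms = set(document.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


def training_statistics(training_docs, query):
    """Steps (b) and (c): score each training document, then summarize the scores."""
    scores = [relevance_score(doc, query) for doc in training_docs]
    # Population standard deviation is used here; the claims do not say which form applies.
    return statistics.mean(scores), statistics.pstdev(scores)


# Step (a): a training set believed to be mostly off-topic, plus an on-topic query.
training_docs = [
    "stock prices fell sharply today",
    "the team won the football match",
    "new solar panel design announced",
    "solar eclipse visible next week",
]
query = "solar panel efficiency"

mean_off, std_off = training_statistics(training_docs, query)
```

The resulting mean and standard deviation are the "determined statistics" that the later normalizing step would consume.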

29. (new) The method of claim 28, wherein the statistics include a mean score of 
the training documents not relevant to the topic and a standard deviation of the scores 
assigned to the set of training documents not relevant to the topic. 

30. (new) The method of claim 29, wherein said normalizing step determines the 
normalized score according to the following formula:

normalized_score = (s - μ_off_topic) / σ_off_topic

wherein s represents the score assigned to the testing document, μ_off_topic
represents the mean score of the documents not relevant to the topic, and σ_off_topic
represents the standard deviation of the scores assigned to the set of training documents
not relevant to the topic.
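
As a minimal sketch, the formula of claim 30 is a standard z-score against the off-topic score distribution. The function name, parameter names, and the guard against a non-positive standard deviation are assumptions added for the example.

```python
def normalize_score(s: float, mean_off_topic: float, std_off_topic: float) -> float:
    """Return (s - mean_off_topic) / std_off_topic for a testing document score s."""
    if std_off_topic <= 0:
        # The claim does not address this case; rejecting it is an assumption.
        raise ValueError("off-topic standard deviation must be positive")
    return (s - mean_off_topic) / std_off_topic


# A raw score of 7.0 against off-topic statistics (mean 4.0, std 1.5)
# lies two standard deviations above the off-topic mean.
normalized = normalize_score(7.0, mean_off_topic=4.0, std_off_topic=1.5)
```

Expressed this way, scores for different topics share a common unit (off-topic standard deviations), which is what makes them comparable.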
31. (new) The method of claim 28, said determining step further comprising:

determining statistics relating to scores assigned to a set of training documents
relevant to the topic.

32. (new) The method of claim 31, wherein the statistics relating to scores 
assigned to a set of training documents relevant to the topic include a mean score of the 
documents relevant to the topic and a standard deviation of the scores assigned to the set 
of training documents relevant to the topic. 

33. (new) The method of claim 32, wherein said normalizing step comprises:

normalizing a score assigned to a testing document based on the statistics relating
to the scores assigned to the set of training documents not relevant to the topic and based
on the statistics relating to the scores assigned to the set of training documents relevant to
the topic.

34. (new) The method of claim 33, wherein said normalizing step determines the 
normalized score according to the following formula:

normalized_score = f_on_topic * ((s - μ_off_topic) / σ_off_topic)

wherein f_on_topic represents a scale factor based on the statistics relating to the
scores assigned to the set of training documents relevant to the topic, s represents the
score assigned to the testing document, μ_off_topic represents the mean score of the
documents not relevant to the topic, and σ_off_topic represents the standard deviation of the
scores assigned to the set of training documents not relevant to the topic.
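
The claim states only that the scale factor is based on the on-topic statistics and does not give its form. The sketch below assumes one possible choice, purely for illustration: a factor chosen so that the mean on-topic score normalizes to 1.0. The function names and that choice are hypothetical.

```python
def on_topic_scale_factor(mean_on: float, mean_off: float, std_off: float) -> float:
    """Hypothetical scale factor: the reciprocal of the on-topic mean's z-score."""
    if std_off <= 0:
        raise ValueError("off-topic standard deviation must be positive")
    z_on = (mean_on - mean_off) / std_off
    if z_on <= 0:
        raise ValueError("on-topic mean must exceed the off-topic mean")
    return 1.0 / z_on


def normalize_scaled(s: float, f_on_topic: float, mean_off: float, std_off: float) -> float:
    """Return f_on_topic * ((s - mean_off) / std_off), as in the claimed formula."""
    return f_on_topic * ((s - mean_off) / std_off)


f = on_topic_scale_factor(mean_on=10.0, mean_off=4.0, std_off=1.5)
halfway = normalize_scaled(7.0, f, mean_off=4.0, std_off=1.5)
typical_on_topic = normalize_scaled(10.0, f, mean_off=4.0, std_off=1.5)
```

Under this assumed factor, a normalized score of 0.0 corresponds to a typical off-topic document and 1.0 to a typical on-topic document.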

35. (new) The method of claim 28, wherein said designating step comprises:

designating the testing document as relevant to the topic based on a determination
that the normalized score is greater than the threshold score; and

designating the testing document as not relevant to the topic based on a 
determination that the normalized score is not greater than the threshold score. 
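
The designating step of claim 35 reduces to a strict comparison. A minimal sketch, with a hypothetical function name and sample values:

```python
def designate(normalized_score: float, threshold_score: float) -> bool:
    """True means relevant to the topic; False means not relevant."""
    # Strictly greater than: a score equal to the threshold is not relevant.
    return normalized_score > threshold_score


threshold = 2.0
labels = [designate(score, threshold) for score in (2.5, 2.0, -0.3)]
```

A score exactly equal to the threshold falls on the "not relevant" side, matching the "not greater than" wording.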

36. (new) The method of claim 28, further comprising: 
repeating steps (a)-(d) for a plurality of topics. 

37. (new) The method of claim 28, further comprising: 
repeating steps (a)-(d) for a plurality of testing documents. 

38. (new) The method of claim 28, wherein the statistics include a first robust 
estimate of a mean score of the set of training documents not relevant to the topic and a 
second robust estimate of a standard deviation of the scores assigned to the set of training 
documents not relevant to the topic. 

39. (new) The method of claim 38 wherein the first robust estimate 
comprises: 

setting a first robust estimate threshold, based on the determined statistics; 
removing each document of the set of training documents not relevant to the topic
which is above the first robust estimate threshold, thereby creating a remaining set of 
training documents not relevant to the topic; and 

determining new estimates of the statistics that may be more appropriate for
off-topic documents based solely on the remaining set of training documents.

40. (new) The method of claim 38 wherein the second robust estimate 
comprises: 

setting a second robust estimate threshold, based on the determined statistics; 

removing each document of the set of training documents not relevant to the topic
which is above the second robust estimate threshold, thereby creating a remaining set of 
training documents not relevant to the topic; and 

determining new estimates of the statistics that may be more appropriate for
off-topic documents based solely on the remaining set of training documents.

41. (new) The method of claim 39 wherein the second robust estimate 
comprises: 

setting a second robust estimate threshold, based on the determined statistics; 

removing each document of the set of training documents not relevant to the topic 
which is above the second robust estimate threshold, thereby creating a remaining set of 
training documents not relevant to the topic; and 

determining new estimates of the statistics that may be more appropriate for
off-topic documents based solely on the remaining set of training documents.
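
The robust estimates of claims 39 through 41 can be read as a single trimming pass: set a threshold from the initial statistics, drop the training documents whose scores exceed it, and recompute. The claims do not say how the threshold is derived, so the rule below (mean plus `k` standard deviations, with `k = 2.0`) and the sample scores are assumptions.

```python
import statistics


def robust_off_topic_statistics(scores, k: float = 2.0):
    """Recompute mean and standard deviation after removing high-scoring outliers."""
    mean = statistics.mean(scores)
    std = statistics.pstdev(scores)
    # Hypothetical threshold rule; the claims only require that it be
    # based on the determined statistics.
    threshold = mean + k * std
    remaining = [x for x in scores if x <= threshold]
    return statistics.mean(remaining), statistics.pstdev(remaining)


# Nine low scores and one high score, such as an on-topic document that
# slipped into the presumed off-topic training set.
scores = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 20.0]
raw_mean = statistics.mean(scores)
robust_mean, robust_std = robust_off_topic_statistics(scores)
```

Because the training set is only believed to be mostly off-topic, trimming keeps a few stray on-topic documents from inflating the off-topic mean and standard deviation.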
42. (new) A computer-readable medium containing instructions for performing a
method for normalizing a score associated with a document, the method facilitated by a 
human annotator comprising: 

(a) establishing (1) through the computer-readable medium a set of training 
documents most of which are believed not to be relevant to a topic (off-topic) and (2)
through the human annotator a query relevant to the topic (on-topic); 

(b) assigning, through the computer environment, a training document relevance 
score to each one of the training documents, each training document relevance score 
representing a measure of relevance of its respective document to the topic; 

(c) determining, through the computer environment, statistics relating to all 
training document relevance scores; 

(d) receiving a testing document; 

(e) calculating, through the computer environment, a score of relevance of the 
testing document to the topic to obtain a testing document relevance score; 

(f) normalizing, through the computer environment and based on the statistics, the 
testing document relevance score to obtain a normalized score wherein: 

normalizing adjusts the testing document relevance score based on 
the statistics to be comparable to other scores from which the statistics 
were determined, and 

the normalized score is a better predictor of probability of the 
testing document being relevant than the testing document relevance score;

(g) establishing, through the computer environment, a threshold score 
representing a relevance threshold for the topic; 
(h) comparing the normalized score to the threshold score to obtain a comparison;

and 

(i) designating the testing document as relevant or not relevant to the topic 
based on the comparison. 



43. (new) The computer-readable medium of claim 42, wherein the statistics 
include a mean score of the training documents not relevant to the topic and a standard 
deviation of the scores assigned to the set of training documents not relevant to the topic. 

44. (new) The computer-readable medium of claim 43, wherein said normalizing 
step determines the normalized score according to the following formula:

normalized_score = (s - μ_off_topic) / σ_off_topic

wherein s represents the score assigned to the testing document, μ_off_topic
represents the mean score of the documents not relevant to the topic, and σ_off_topic
represents the standard deviation of the scores assigned to the set of training documents
not relevant to the topic.

45. (new) The computer-readable medium of claim 42, said determining step 
further comprising: 

determining statistics relating to scores assigned to a set of training documents 
relevant to the topic. 
46. (new) The computer-readable medium of claim 45, wherein the statistics
relating to scores assigned to a set of training documents relevant to the topic include a 
mean score of the documents relevant to the topic and a standard deviation of the scores 
assigned to the set of training documents relevant to the topic. 

47. (new) The computer-readable medium of claim 46, wherein said normalizing 
step comprises: 

normalizing a score assigned to a testing document based on the statistics relating 
to the scores assigned to the set of training documents not relevant to the topic and based 
on the statistics relating to the scores assigned to the set of training documents relevant to 
the topic. 

48. (new) The computer-readable medium of claim 47, wherein said normalizing 
step determines the normalized score according to the following formula:

normalized_score = f_on_topic * ((s - μ_off_topic) / σ_off_topic)

wherein f_on_topic represents a scale factor based on the statistics relating to the
scores assigned to the set of training documents relevant to the topic, s represents the
score assigned to the testing document, μ_off_topic represents the mean score of the
documents not relevant to the topic, and σ_off_topic represents the standard deviation of the
scores assigned to the set of training documents not relevant to the topic.

49. (new) The computer-readable medium of claim 42, wherein said designating 
step comprises: 
designating the testing document as relevant to the topic based on a determination
that the normalized score is greater than the threshold score; and

designating the testing document as not relevant to the topic based on a 
determination that the normalized score is not greater than the threshold score. 

50. (new) The computer-readable medium of claim 42, further comprising: 
repeating steps (a)-(d) for a plurality of topics. 

51. (new) The computer-readable medium of claim 42, further comprising: 
repeating steps (a)-(d) for a plurality of testing documents. 

52. (new) The computer-readable medium of claim 42, wherein the statistics 
include a robust estimate of a mean score of the training documents not relevant to the 
topic and a robust estimate of a standard deviation of the scores assigned to the set of 
training documents not relevant to the topic. 

53. (new) A method, facilitated by a human annotator and performed in a 
computer environment for normalizing a score associated with a document, the method 
comprising the steps of: 

(a) receiving (1) through the computer environment a set of training documents 
not relevant to a topic (off-topic) and (2) through the human annotator a query including 
the topic (on-topic); 
(b) assigning, through the computer environment, a training document relevance
score to each one of the training documents, each training document relevance score 
representing a measure of relevance of its respective document to the topic; 

(c) determining, through the computer environment, statistics relating to all 
training document relevance scores; 

(d) receiving a testing document; 

(e) calculating, through the computer environment, a score of relevance of the 
testing document to the topic to obtain a testing document relevance score; and 

(f) normalizing, through the computer environment and based on the statistics, the 
testing document relevance score to obtain a normalized score wherein: 

normalizing adjusts the testing document relevance score based on 
the statistics to be comparable to other scores from which the statistics 
were determined, and 

the normalized score is a better predictor of probability of the 
testing document being relevant than the testing document relevance score.

54. (new) The method of claim 53, further comprising: 

designating the testing document as relevant or not relevant to the topic based on 
the normalized score. 

55. (new) The method of claim 53, further comprising: 
comparing the normalized score to a threshold score; and 
designating the testing document as relevant or not relevant to the topic based on
the comparison. 

56. (new) A method facilitated by a human annotator and performed by a 
processor in a computer environment for searching for documents relevant to a topic 
comprising the steps of: 

establishing through the computer environment a set of training documents not 
relevant to the topic (off-topic); 

the human annotator sending a query including the topic (on-topic) to the 
processor; and 

the human annotator receiving results from the processor indicating a document 
relevant to the topic, wherein the processor: 

assigns, through the computer environment, a training document relevance 
score to each one of the training documents, each training document relevance 
score representing a measure of relevance of its respective document to the topic; 

determines, through the computer environment, statistics relating to all 
training document relevance scores; 

receives a testing document; 

calculates, through the computer environment, a score of relevance of the 
testing document to the topic to obtain a testing document relevance score; 

normalizes, through the computer environment and based on the statistics, 
the testing document relevance score to obtain a normalized score wherein: 
normalizing adjusts the testing document relevance score based on
the statistics to be comparable to other scores from which the statistics 
were determined, and 

the normalized score is a better predictor of probability of the 
testing document being relevant than the testing document relevance score;
establishes, through the computer environment, a threshold score 
representing a relevance threshold for the topic; 

compares the normalized score to the threshold score to obtain a 
comparison; and 

designates the testing document as relevant or not relevant to the topic 
based on the comparison. 